By the end of this session, you should be able to:
R does exactly what you tell it to do, rather than what you want it to do
-Kieran Healy
Restart R (Ctrl/Cmd+Shift+F10), clear the console (Ctrl/Cmd+L), and clear your workspace.
Last week you created a project for this workshop. Is this project still open? If not, click on the project icon to load it. (Don’t create a new one.)
In your console run:
getwd()
If you successfully created a project and have this project open, the working directory should be your project directory.
Every file reference should be relative to this working (root) directory.
../ goes up one level
../../ goes up two levels
subdirectory/ goes down one level
subdirectory/subsubdirectory/ goes down two levels
Identify where you want to save today’s template file (e.g., products).
Change products to your preferred subfolder.
download.file("https://raw.githubusercontent.com/ericpgreen/IEatDataScience/master/labs/lab-w02.Rmd",
              destfile = "products/lab-w02.Rmd")
When you knit a document, RStudio treats the directory where the document is saved as the working directory. If your template is stored in root/products, for instance, products will become the working directory and file paths written relative to the project root won’t work. To fix this, we added the following to the first R code chunk:
opts_knit$set(root.dir=normalizePath('../'))
This says, “Yo Knitr, the working directory is one level up from this file.”
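To see what '../' resolves to, here is a minimal sketch that builds a throwaway folder structure in a temporary directory. The folder name demo-project is hypothetical and stands in for your own project; nothing here touches your real files.

```r
# Build a throwaway project with a products/ subfolder (hypothetical names).
root <- file.path(tempdir(), "demo-project")
dir.create(file.path(root, "products"), recursive = TRUE, showWarnings = FALSE)

# Pretend we are knitting a file saved in products/.
old_wd <- setwd(file.path(root, "products"))

# '../' points one level up from products/, which is the project root.
one_up <- normalizePath("../")

# Restore the original working directory.
setwd(old_wd)

# The folder one level up is the project root.
basename(one_up)
```

The last line should print the name of the project folder, which is the same directory knitr is told to use as its root.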
Turn to the back of your Markdown cheatsheet and try writing some text under the “Markdown Practice” heading (e.g., bold, italics, lists, subheadings, web links, footnote). Then click “Knit” to compile the document.
The data files we want to import are sitting on Dropbox. We could import them directly into the R session, but let’s first download the files to your project directory and then load them into R. The loading step will teach you how to read in files from your local machine.
Replace input with the name of the folder where you want to store raw data.
Replace {r} with {r, eval=FALSE} to prevent R from trying to download the files over and over again.
download.file("https://www.dropbox.com/s/k4d2j6feayezkun/r2d2.csv?dl=1",
destfile = "input/r2d2-w02.csv")
download.file("https://www.dropbox.com/s/1e2tqqmfzmzaybe/r2d2.dta?dl=1",
destfile = "input/r2d2-w02.dta")
download.file("https://www.dropbox.com/s/6b8t2c877yvqeax/r2d2.rds?dl=1",
destfile = "input/r2d2-w02.rds")
download.file("https://www.dropbox.com/s/beftqecfs4vvuss/r2d2.sas?dl=1",
destfile = "input/r2d2-w02.sas")
download.file("https://www.dropbox.com/s/tiqgit23db2fk8x/r2d2.txt?dl=1",
destfile = "input/r2d2-w02.text")
download.file("https://www.dropbox.com/s/7a3i7e0bfmqiikr/r2d2.xlsx?dl=1",
destfile = "input/r2d2-w02.xlsx")
A csv file is an ideal format for sharing data. Simple. Lightweight. Readable by almost any program. Import with the read.csv() function. Start by running ?read.csv in the console to view the help file.
What arguments are required?
Now import the csv file:
datCSV <- read.csv("input/r2d2-w02.csv", stringsAsFactors = FALSE)
This will create an object called datCSV.
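If you want to see read.csv() work without the workshop data, the following sketch writes a tiny made-up csv file to a temporary location and reads it back. The file contents and the object name toy are invented for illustration; they are not the r2d2 data.

```r
# Write a small, made-up csv file to a temporary location.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,name,score",
             "1,Ana,3.5",
             "2,Ben,4.0"),
           csv_path)

# Read it back. stringsAsFactors = FALSE keeps text columns as character.
toy <- read.csv(csv_path, stringsAsFactors = FALSE)

class(toy)       # the imported object is a data frame
class(toy$name)  # text stays character rather than becoming a factor
str(toy)         # compact summary of every column
```

The only argument without a default is the file path; everything else, including stringsAsFactors, is optional.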
You can name this object anything you want. The most important thing is to be consistent. Here’s Hadley’s advice:
foo or Eric2
datWave1_wide
str(datCSV)
R has several data types. datCSV is a data frame that consists of 372 rows and 13 variables. Let’s use two built-in functions to count them and print the results inline. Go to where you see the following line:
The datCSV data frame has … observations (rows) and … columns.
Replace the first … with `r nrow(datCSV)` for the number of rows, and replace the second … with `r length(datCSV)` for the number of columns. The leading r inside the backticks tells knitr to evaluate the code rather than print it. Then knit your document.
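A quick sketch of the counting functions on a small made-up data frame (toy is a hypothetical stand-in for datCSV) shows why length() works for columns: a data frame is a list of columns, so its length is the number of columns.

```r
# A made-up data frame with 3 rows and 2 columns.
toy <- data.frame(id = 1:3,
                  name = c("a", "b", "c"),
                  stringsAsFactors = FALSE)

nrow(toy)    # number of rows: 3
ncol(toy)    # number of columns: 2
length(toy)  # also 2, because a data frame is a list of columns
dim(toy)     # both at once: rows, then columns
```

ncol() is often easier to read than length() when the intent is to count columns.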
We can also examine datCSV with the glimpse() function in the dplyr package, which is included in the tidyverse.
library(tidyverse)
glimpse(datCSV)
If you get an error that there is no function called glimpse, run install.packages("tidyverse") in the console.
Knit the file and you’ll see that the glimpse results print. Replace {r glimpse} with {r glimpse, results='hide'} and knit again.
To turn off code printing, replace {r glimpse, results='hide'} with {r glimpse, results='hide', echo=FALSE} and knit again.
You can also use the functions head() or tail() to examine the first or last few rows.
head(datCSV)
tail(datCSV)
Use the kableExtra package to print nice HTML tables. Use this code and knit to HTML this time.
library(knitr)
library(kableExtra)
head(datCSV[, 1:6]) %>%
  kable("html") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE)
If you get an error that there is no function called kable_styling, run install.packages("kableExtra") in the console.
Pipes %>% and [rows, columns]
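As a minimal sketch of both ideas, the code below uses a made-up data frame (toy, with hypothetical columns id, score, and group). It assumes the magrittr package is installed; it is the package that supplies %>% and it is installed along with the tidyverse.

```r
library(magrittr)  # supplies %>%; installed along with the tidyverse

# A made-up data frame for illustration.
toy <- data.frame(id = 1:4,
                  score = c(10, 20, 30, 40),
                  group = c("a", "a", "b", "b"),
                  stringsAsFactors = FALSE)

# [rows, columns]: leave a side blank to keep everything on that side.
toy[2, ]                               # row 2, all columns
toy[, 1:2]                             # all rows, columns 1 and 2
toy[toy$score > 15, c("id", "score")]  # filter rows, pick columns by name

# The pipe passes the left-hand side as the first argument on the right:
# x %>% f(y) is the same as f(x, y).
nested <- head(toy[, 1:2], 2)
piped  <- toy[, 1:2] %>% head(2)
identical(nested, piped)
```

The piped version reads left to right, which is why the kable example chains head(), kable(), and kable_styling() this way.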
library(foreign)
library(tidyverse)
library("googledrive")
drive_find(n_max = 25)
glimpse() group_by() mutate() summarise() n()
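These functions come from dplyr. The sketch below strings them together on a made-up data frame; the column names group and score are hypothetical and are not taken from the r2d2 data. It assumes dplyr is installed (it is part of the tidyverse).

```r
library(dplyr)

# A made-up data frame for illustration.
toy <- data.frame(group = c("a", "a", "b"),
                  score = c(1, 3, 10),
                  stringsAsFactors = FALSE)

glimpse(toy)  # one line per column: name, type, first values

summary_tbl <- toy %>%
  mutate(score_doubled = score * 2) %>%  # add a new column
  group_by(group) %>%                    # later steps run once per group
  summarise(n_rows = n(),                # n() counts rows in each group
            mean_score = mean(score))

summary_tbl
```

summary_tbl has one row per group, with the row count and mean score for each.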
paste0() loops
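One way to combine the two, sketched here with base R only, is to build file paths in a loop. The path prefix mirrors the destfile names used earlier; the choice of three extensions is just for illustration, and the loop builds strings without reading any files.

```r
# File extensions to loop over (a subset, for illustration).
extensions <- c("csv", "dta", "rds")

# Pre-allocate a character vector to hold the results.
paths <- character(length(extensions))

# paste0() glues strings together with no separator.
for (i in seq_along(extensions)) {
  paths[i] <- paste0("input/r2d2-w02.", extensions[i])
}

paths

# paste0() is vectorized, so the loop can be replaced by one call.
paths_vectorized <- paste0("input/r2d2-w02.", extensions)
identical(paths, paths_vectorized)
```

The vectorized call gives the same result as the loop; the loop form becomes useful when each step does more than build a string.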